We develop a testing procedure for distinguishing between a long-range dependent time series and a weakly dependent time series with change-points in the mean. In the simplest case, under the null hypothesis the time series is weakly dependent with one change in mean at an unknown point, and under the alternative it is long-range dependent. We compute the CUSUM statistic $T_n$, which allows us to construct an estimator $\hat k$ of a change-point. We then compute the statistic $T_{n,1}$ based on the observations up to time $\hat k$ and the statistic $T_{n,2}$ based on the observations after time $\hat k$. The statistic $M_n = \max[T_{n,1}, T_{n,2}]$ converges to a well-known distribution under the null, but diverges to infinity if the observations exhibit long-range dependence. The theory is illustrated by examples and an application to the returns of the Dow Jones index.

1. Introduction. The present paper develops a testing procedure for distinguishing between a long-range dependent time series and a weakly dependent time series with change-points in the mean.

Many geophysical time series records have long been known to exhibit long nonperiodic cycles or persistent deviations from the mean. In the mid-1960s Mandelbrot and his collaborators proposed the use of self-similar processes, most notably fractional Brownian motion, to model such records;

Received July 2003; revised July 2005.
Supported by NSF Grant INT-0223262 and NATO Grant PST.EAP.CLG 980599. 
Supported in part by OTKA Grants T 037886 and T 043037. 
Supported in part by NSF Grant DMS-04-13653. 

Supported in part by NSF Grant DMS-01-03487, and Grants R-146-000-038-101 and 
R-1555-000-035-112 at the National University of Singapore.
AMS 2000 subject classifications. 62M10, 62G10. 

Key words and phrases. Change-point in mean, CUSUM, long-range dependence, vari- 
ance of the mean. 



This is an electronic reprint of the original article published by the 
Institute of Mathematical Statistics in The Annals of Statistics, 
2006, Vol. 34, No. 3, 1140-1165. This reprint differs from the original in 
pagination and typographic detail. 




BERKES, HORVATH, KOKOSZKA AND SHAO 



see, for example, [33]. Over a decade later, Granger and Joyeux [21] and Hosking [25] (see also [1]) introduced fractional ARIMA processes, which are approximately self-similar and offer a much greater modeling flexibility. If their fractional differencing parameter $d$ satisfies $0 < d < 1/2$, these processes are stationary and possess long-range dependence, or long memory, in the sense that the autocovariance function is not absolutely summable (it decays like $k^{2d-1}$ as the lag $k \to \infty$). In the 1980s there was substantial interest in using long memory processes to model macroeconomic time series, whereas, in the 1990s, the focus shifted to modeling the volatility of returns on speculative assets by such processes; an in-depth discussion and relevant references are provided in [23]. Following the pioneering work of Leland et al. [30] and Paxson and Floyd [38], self-similar processes have also increasingly been used to model certain aspects of computer network traffic; see [36]. There are many other fields where models exhibiting long-range dependence have been used; see [14] for a recent extensive review.

Even though modeling certain time series in the aforementioned fields 
by means of long-range dependent processes has become quite widespread, 
especially in geophysics, it is clear that a series with long periods where 
the observations are away from the mean can also naturally be modeled 
by a nonstationary process whose mean changes. Bhattacharya, Gupta and 
Waymire [9] used mathematical arguments to show that the so-called Hurst 
effect, which motivated Mandelbrot and his collaborators to advocate the 
use of self-similar processes, can also be explained if the observations $X_k$ are
assumed to follow the model $X_k = Y_k + f(k)$, where $Y_k$ is a weakly dependent
stationary process and $f$ is a deterministic function. That research was
elaborated on by Giraitis, Kokoszka and Leipus [16] who showed that several 
statistics akin to the modified R/S statistic of Lo [31] diverge to infinity un- 
der either long-range dependence or weak dependence with change-points. 
In a similar spirit, Diebold and Inoue [13] argued that the appearance of 
long memory can be explained by some econometric models which involve 
changes in their defining parameters. Mikosch and Starica [34, 35] asserted 
that what had been seen by many as long memory in the volatility of returns 
is, in fact, a manifestation of changes in the parameters of the underlying 
GARCH-type models. In the context of network traffic, similar findings are 
reported in [26]. The above list of references is not exhaustive, but it empha- 
sizes that it is difficult to distinguish a truly long-range dependent process 
from a process with some form of nonstationarity, including shifts in mean. 
Standard tools like ACF plots and periodogram-based spectral estimates 
behave in a very similar way under these two alternatives. There are also a 
number of long memory tests designed to test the null hypothesis of weak de- 
pendence against an alternative of long-range dependence and change-point 
tests developed to test the same null hypothesis but against a change-point 
alternative. Most long memory tests reject in the presence of change-points
and many change-point tests reject in the presence of long memory.

The answer to the question of which approach to use will often depend on 
a specific application at hand. A long-range dependent process may, for ex- 
ample, provide a parsimonious description of a long, possibly nonstationary, 
time series. On the other hand, to construct short term forecasts of a possi- 
bly self-similar process, it might be advisable to fit an ARMA model to the 
most recent stretch of data after the last estimated change-point. In many 
applications, however, such as, for example, constructing long term forecasts, 
it does matter which model better fits the data. We refer to [10] and [28] 
for some relevant financial applications. Formal statistical tests which would 
help decide if a particular time series is better described as a realization of a 
long-range dependent process or as a realization of a weakly dependent pro- 
cess with change-points are therefore of value. There has, however, not been 
much research in this direction. Künsch [29] proposed a periodogram-based
procedure to discriminate between a long-range dependent process and the
process $X_k = Y_k + f(k)$ with a monotonic function $f$ and Gaussian weakly
dependent $Y_k$. Heyde and Dai [24] showed that procedures for detecting long
memory which are based on a smoothed periodogram are robust in the pres- 
ence of small trends. These ideas were recently developed by Sibbertsen and 
Venetis [42] who proposed a test based on a difference between the Geweke 
and Porter-Hudak [15] estimator of d and its version based on the tapered 
periodogram. 

A main objective of the present paper is to develop the theory underlying 
a test procedure for discriminating between long-range dependence and weak 
dependence with change-points in mean. The proposed test is a simple time 
domain procedure based on a CUSUM statistic for the partial sums, which 
is perhaps the most extensively used statistic for detecting and estimating 
change-points in mean. To describe the idea, suppose that, under the null 
hypothesis, the time series is weakly dependent with one change in mean 
and under the alternative, it is long-range dependent. Consider the CUSUM 
statistic $T_n$ defined by (3.1). Using $T_n$, we can construct an estimator $\hat k$ of the
change-point (no matter if a change-point exists or not). We then compute
the statistic $T_{n,1}$ based on the observations up to time $\hat k$ and the statistic $T_{n,2}$
based on the observations after time $\hat k$. The statistic $M_n = \max[T_{n,1}, T_{n,2}]$
converges to a well-known distribution under the null (cf. Corollary 2.1), 
but diverges to infinity under the alternative. 

Our theory uses the almost sure asymptotics for the Bartlett variance estimator $s_n^2$ stated in Theorem A.1, which was established in [8]. For a weakly dependent process, $s_n^2$ is an estimator of the variance of the sample mean or of the spectral density at frequency zero. Estimators of this type have been extensively studied in the time series literature in the last half century and go back to the work of Bartlett [5], Grenander and Rosenblatt [22] and Parzen [37]. Andrews [3] provides a more recent perspective. As far as we know, all consistency results pertaining to the class of kernel estimators such as $s_n^2$ establish convergence in an $L^p$ norm or in probability. Such results might possibly be applied in our context after some additional technical work, but we are not aware of any convergence in probability results which would allow us to establish our main results, Theorems 2.1 and 2.2, under weaker conditions. Moreover, almost sure convergence offers a convenient approach based on the observation that if $Z_n \to Z$ almost surely and $k_n \to \infty$ in probability, then $Z_{k_n} \to Z$ in probability [see, e.g., the argument justifying (B.11)].

The paper is organized as follows. In Section 2 we formulate the assump- 
tions, describe the testing procedure in a simple illustrative situation and 
state the relevant theorems. Section 3 discusses the broader applicability 
of the procedure, provides some additional background and examples and 
concludes with an application to returns of the Dow Jones index. The ap- 
pendices contain the proofs. 

2. Assumptions and the testing procedure. To focus attention and lighten 
the notation, we concentrate in this section on a situation where the obser- 
vations can either follow a model with one change in the mean of weakly 
dependent time series or are long-range dependent. In Section 3 we explain 
how the proposed procedure can be used in a situation when there is an 
upper bound on the number of possible changes in the mean. 

The observations $X_i$ follow a change-point model if

\[ X_i = \begin{cases} \mu + Y_i, & 1 \le i \le k^*, \\ \mu + \Delta + Y_i, & k^* < i \le n. \end{cases} \tag{2.1} \]

In (2.1) $k^*$ is the unknown time of a possible change in mean, and the means $\mu$ and $\mu + \Delta$ are also unknown. The sequence $\{Y_i\}$ is assumed to have mean zero and to be weakly stationary in a sense made precise by Assumption 2.1. Recall that, for a fourth-order stationary sequence $\{Y_k\}$ with mean zero and $\gamma_j = \mathrm{Cov}(Y_0, Y_j)$, the fourth-order cumulant is defined by

\[ \kappa(h, r, s) = E[Y_k Y_{k+h} Y_{k+r} Y_{k+s}] - (\gamma_h \gamma_{r-s} + \gamma_r \gamma_{h-s} + \gamma_s \gamma_{h-r}). \tag{2.2} \]

Assumption 2.1. The sequence $\{Y_k\}$ is fourth-order stationary with mean zero and autocovariance function $\gamma_j = \mathrm{Cov}(Y_0, Y_j)$, and the following conditions hold:

\[ n^{-1/2} \sum_{1 \le j \le nt} Y_j \stackrel{d}{\to} \sigma W(t) \quad \text{in } D[0,1] \tag{2.3} \]

for some $\sigma > 0$ and

\[ \sum_{j} |\gamma_j| < \infty, \tag{2.4} \]

\[ \sup_h \sum_{r,s} |\kappa(h, r, s)| < \infty. \tag{2.5} \]



Remark 2.1. By the Skorokhod-Wichura-Dudley representation (see, e.g., [41]), condition (2.3) is equivalent to the following condition: There are Wiener processes $W_n(t)$, $t \in [0, 1]$, such that

\[ \sup_{0 \le t \le 1} \left| n^{-1/2} \sum_{1 \le j \le nt} Y_j - \sigma W_n(t) \right| = o_P(1). \tag{2.6} \]

Condition (2.6) is often more convenient to refer to in the proofs.

We now make precise the statement that the observations $\{X_j\}$ are long-range dependent. In the following $W_H(\cdot)$ stands for the fractional Brownian motion with parameter $H$, that is, a Gaussian process with mean zero and covariances

\[ E[W_H(t) W_H(s)] = \left( t^{2H} + s^{2H} - |t - s|^{2H} \right)/2. \]

If $1/2 < H < 1$, the increments of the fractional Brownian motion are long-range dependent. It is convenient to identify the self-similarity parameter $H$ with the differencing parameter $d$ introduced in Section 1 via the relation $H = d + 1/2$ because the increments of $W_H$, which form a stationary process, have the same rate of decay of the autocovariance function as a fractional ARIMA with $d = H - 1/2$; see, for example, Section 7.13 of [40]. In condition (2.9) of Assumption 2.2 below, and throughout the paper, $a_j \sim b_j$ means that $\lim_{j \to \infty} a_j/b_j = 1$.

Assumption 2.2. The sequence $\{X_j\}$ is fourth-order stationary with $\mu = EX_j$ and $\gamma_j = \mathrm{Cov}(X_0, X_j)$ and satisfies the following conditions:

\[ \frac{1}{n^H} \sum_{1 \le j \le nt} (X_j - \mu) \stackrel{d}{\to} c_H W_H(t) \quad \text{in } D[0,1] \tag{2.7} \]

for some $c_H > 0$ and

\[ \tfrac{1}{2} < H < 1. \tag{2.8} \]

Moreover,

\[ \gamma_j \sim c_0 j^{2H-2} \tag{2.9} \]

for some $c_0 > 0$, and the cumulants (2.2) satisfy

\[ \sup_h \sum_{-n < r,s < n} |\kappa(h, r, s)| = O(n^{2H-1}). \tag{2.10} \]

The cumulant condition (2.5) is weaker than the traditional condition in which $\sup_h$ is replaced by $\sum_h$; see, for example, [2] and [3]. Condition (2.10) is a natural counterpart of (2.5) and holds for the extensively used fractional ARIMA models. For these models, the range (2.8) corresponds to $0 < d < 1/2$. We do not consider $-1/2 < d < 0$ because realizations of such processes do not exhibit apparent shifts in mean.

We wish to test

$H_0$: The observations $X_1, \ldots, X_n$ follow the change-point model (2.1) with the $Y_i$ satisfying Assumption 2.1

against

$H_A$: The observations $X_1, \ldots, X_n$ are long-range dependent, that is, satisfy Assumption 2.2.

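In order to define the test statistic, we first introduce a change-point estimator,

\[ \hat k = \min\left\{ k : \max_{1 \le i \le n} \left| \sum_{1 \le j \le i} X_j - \frac{i}{n} \sum_{1 \le j \le n} X_j \right| = \left| \sum_{1 \le j \le k} X_j - \frac{k}{n} \sum_{1 \le j \le n} X_j \right| \right\}. \tag{2.11} \]

Next we define the statistics

\[ T_{n,1} = \frac{1}{s_{n,1}} \hat k^{-1/2} \max_{1 \le k \le \hat k} \left| \sum_{1 \le i \le k} X_i - \frac{k}{\hat k} \sum_{1 \le i \le \hat k} X_i \right| \tag{2.12} \]

based on $X_1, \ldots, X_{\hat k}$ and

\[ T_{n,2} = \frac{1}{s_{n,2}} (n - \hat k)^{-1/2} \max_{\hat k < k \le n} \left| \sum_{\hat k < i \le k} X_i - \frac{k - \hat k}{n - \hat k} \sum_{\hat k < i \le n} X_i \right| \tag{2.13} \]

based on $X_{\hat k + 1}, \ldots, X_n$. In (2.12) and (2.13), $s_{n,1}^2$ and $s_{n,2}^2$ are the Bartlett estimators computed, respectively, from $X_1, \ldots, X_{\hat k}$ and $X_{\hat k + 1}, \ldots, X_n$. Specifically, setting

\[ \bar X_{\hat k} = \frac{1}{\hat k} \sum_{1 \le i \le \hat k} X_i, \qquad \tilde X_{\hat k} = \frac{1}{n - \hat k} \sum_{\hat k < i \le n} X_i \]

and

\[ \omega_j(q) = 1 - \frac{j}{q + 1}, \tag{2.14} \]

we have

\[ s_{n,1}^2 = \frac{1}{\hat k} \sum_{1 \le i \le \hat k} (X_i - \bar X_{\hat k})^2 + 2 \sum_{1 \le j \le q(\hat k)} \omega_j(q(\hat k)) \frac{1}{\hat k} \sum_{1 \le i \le \hat k - j} (X_i - \bar X_{\hat k})(X_{i+j} - \bar X_{\hat k}), \tag{2.15} \]

\[ s_{n,2}^2 = \frac{1}{n - \hat k} \sum_{\hat k < i \le n} (X_i - \tilde X_{\hat k})^2 + 2 \sum_{1 \le j \le q(n - \hat k)} \omega_j(q(n - \hat k)) \frac{1}{n - \hat k} \sum_{\hat k < i \le n - j} (X_i - \tilde X_{\hat k})(X_{i+j} - \tilde X_{\hat k}). \tag{2.16} \]

The test statistic is defined as

\[ M_n = \max\{T_{n,1}, T_{n,2}\}. \tag{2.17} \]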
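The construction in (2.11)-(2.17) can be illustrated by the following minimal Python sketch. It is not the authors' code (their computations were done in Splus); the function names, the helper `bandwidth` and the choice $q(n) = 15\log_{10}(n)$ borrowed from the data example are assumptions made for illustration, and the sketch assumes that both subsamples contain enough non-constant observations for the variance estimates to be positive.

```python
import math


def bartlett_variance(x, q):
    """Bartlett estimator: gamma_0 + 2 * sum_{j=1}^{q} (1 - j/(q+1)) * gamma_j."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    s2 = sum(v * v for v in c) / n
    for j in range(1, min(q, n - 1) + 1):
        gamma_j = sum(c[i] * c[i + j] for i in range(n - j)) / n
        s2 += 2.0 * (1.0 - j / (q + 1.0)) * gamma_j
    return s2


def cusum_path(x):
    """Return |S_k - (k/n) S_n| for k = 1, ..., n."""
    n = len(x)
    total = sum(x)
    path, partial = [], 0.0
    for k, v in enumerate(x, start=1):
        partial += v
        path.append(abs(partial - k * total / n))
    return path


def change_point(x):
    """Smallest k at which the CUSUM path attains its maximum, cf. (2.11)."""
    path = cusum_path(x)
    return path.index(max(path)) + 1


def cusum_stat(x, bandwidth):
    """CUSUM statistic normalised by the Bartlett estimator, cf. (2.12)-(2.13)."""
    n = len(x)
    s = math.sqrt(bartlett_variance(x, bandwidth(n)))
    return max(cusum_path(x)) / (s * math.sqrt(n))


def m_statistic(x, bandwidth):
    """Return (k_hat, M_n) with M_n = max(T_{n,1}, T_{n,2}), cf. (2.17)."""
    k_hat = change_point(x)
    t1 = cusum_stat(x[:k_hat], bandwidth)   # X_1, ..., X_{k_hat}
    t2 = cusum_stat(x[k_hat:], bandwidth)   # X_{k_hat+1}, ..., X_n
    return k_hat, max(t1, t2)


def bandwidth(n):
    """Hypothetical bandwidth q(n) = 15 * log10(n), truncated to an integer."""
    return max(1, int(15 * math.log10(n)))
```

Calling `m_statistic(x, bandwidth)` on a list of observations returns the estimated change-point together with $M_n$, which is then compared with a critical value of the limit in Corollary 2.1.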

We first derive the asymptotic distribution of $M_n$ under $H_0$. We need to impose additional assumptions on the change-point model (2.1): both $k^*$, the time of change, and $\Delta$, the size of the change, depend on the sample size $n$ such that

\[ k^* = [n\theta] \quad \text{for some } 0 < \theta < 1, \tag{2.18} \]

\[ n\Delta^2 \to \infty, \tag{2.19} \]

\[ \Delta^2 |\hat k - k^*| = O_P(1). \tag{2.20} \]

Condition (2.20) is known to hold if the observations are uncorrelated and was extended by Bai [4], Proposition 3, to moving averages driven by white noise. It also holds if the process $Y_i$ in the change-point model (2.1) is strictly stationary, satisfies the approximation condition (2.6), $\Delta \to 0$ and (2.19) holds; see Theorem 4.1.4 in [11]. [There is a misprint in that theorem and $\gamma = 0$, which corresponds to our statistic $T_n$, should be included in part (i). The tail condition (4.1.9) in [11] is not needed because it is used only for $\gamma > 0$.] Since the squares of ARCH($\infty$) processes satisfy (2.6) (see Theorem 2.1 in [16]), (2.20) holds for such processes.

We will also often impose the following condition on the bandwidth $q(n)$:

\[ q(n)\Delta^2 = O(1). \tag{2.21} \]

Theorem 2.1. Suppose $H_0$ and (2.18)-(2.21) hold. Suppose $q(n)$ is nondecreasing and satisfies

\[ \sup_{k \ge 0} \frac{q(2^{k+1})}{q(2^k)} < \infty, \tag{2.22} \]

\[ q(n) \to \infty \quad \text{and} \quad q(n)(\log n)^4 = O(n). \tag{2.23} \]

Then

\[ (T_{n,1}, T_{n,2}) \stackrel{d}{\to} \left( \sup_{0 \le t \le 1} |B^{(1)}(t)|, \ \sup_{0 \le t \le 1} |B^{(2)}(t)| \right), \]

where $B^{(1)}$ and $B^{(2)}$ are independent Brownian bridges.

Theorem 2.1 is proved in Appendix B.

Corollary 2.1. Under the assumptions of Theorem 2.1, we have

\[ M_n \stackrel{d}{\to} \max\left\{ \sup_{0 \le t \le 1} |B^{(1)}(t)|, \ \sup_{0 \le t \le 1} |B^{(2)}(t)| \right\}. \]

Since the distribution function of $\sup_{0 \le t \le 1} |B^{(1)}(t)|$ is known (cf. Section 1.5 of [12]), the limit distribution in Corollary 2.1 can be computed explicitly.

In order to describe the asymptotic behavior of the vector $(T_{n,1}, T_{n,2})$ if the observations $X_i$ are long-range dependent, we define

\[ B_H(t) = W_H(t) - tW_H(1) \]

and

\[ \xi = \inf\left\{ t \ge 0 : |B_H(t)| = \sup_{0 \le s \le 1} |B_H(s)| \right\}. \tag{2.24} \]

Theorem 2.2. Suppose $H_A$ holds. Assume $q(n)$ is nondecreasing, satisfies (2.22) and

\[ q(n) \to \infty \quad \text{and} \quad q(n) = O\left( n (\log n)^{-7/(4 - 4H)} \right). \tag{2.25} \]

Then, the sequence of random vectors

\[ \left( \left[ \frac{q(\hat k)}{n} \right]^{H - 1/2} T_{n,1}, \ \left[ \frac{q(n - \hat k)}{n} \right]^{H - 1/2} T_{n,2} \right) \]

converges in distribution to the random vector

\[ \left( \frac{1}{\sqrt{\xi}} \sup_{0 \le t \le \xi} \left| W_H(t) - \frac{t}{\xi} W_H(\xi) \right|, \ \frac{1}{\sqrt{1 - \xi}} \sup_{\xi \le t \le 1} \left| (W_H(t) - W_H(\xi)) - \frac{t - \xi}{1 - \xi} (W_H(1) - W_H(\xi)) \right| \right). \]

Theorem 2.2 is proved in Appendix C.

Theorem 2.2 implies that $T_{n,1}$ and $T_{n,2}$ tend to infinity in probability. Consequently, the test statistic $M_n$ tends to infinity in probability under $H_A$.
3. Discussion and examples. One of the most often used statistics for testing the null hypothesis $\Delta = 0$ in the change-point model (2.1) is the CUSUM statistic

\[ T_n = \frac{1}{n^{1/2} s_n} \max_{1 \le k \le n} \left| \sum_{1 \le i \le k} X_i - \frac{k}{n} \sum_{1 \le i \le n} X_i \right|, \tag{3.1} \]

where $s_n^2$ is a suitable estimator of the variance of the sample mean of the $X_i$. If the $Y_i$ in (2.1) are independent identically distributed, $s_n^2$ can be taken to be the sample variance. In this paper we allow the $Y_i$ to be dependent and consider the estimator

\[ s_n^2 = \hat\gamma_0 + 2 \sum_{1 \le j \le q(n)} \omega_j(q(n)) \hat\gamma_j, \tag{3.2} \]

where

\[ \hat\gamma_j = \frac{1}{n} \sum_{1 \le i \le n - j} (X_i - \bar X_n)(X_{i+j} - \bar X_n) \tag{3.3} \]

are the sample autocovariances and $\omega_j(q)$ are the Bartlett weights defined by (2.14).

If the observations are weakly dependent (with no change in the mean), the statistic $T_n$ converges in distribution to the supremum of the absolute value of a Brownian bridge. However, $T_n \stackrel{P}{\to} \infty$ either if there is a shift in mean or if the observations are long-range dependent. The latter case is often referred to as a spurious rejection of the null hypothesis of no change in mean. We formalize these observations in Theorems 3.1, 3.2 and 3.3 which, together with Theorems 2.1 and 2.2, form a theoretical foundation for the multistage testing procedure described later in this section. In Theorems 3.1, 3.2 and 3.3, it suffices to assume that

\[ q(n) \to \infty \quad \text{and} \quad q(n)/n \to 0 \quad \text{as } n \to \infty. \tag{3.4} \]

Theorem 3.1. Suppose the observations $X_1, \ldots, X_n$ follow model (2.1) with $\Delta = 0$. If Assumption 2.1 and (3.4) hold, then

\[ T_n \stackrel{d}{\to} \sup_{0 \le t \le 1} |B(t)|, \]

where $\{B(t), 0 \le t \le 1\}$ is a Brownian bridge.

Proof. Theorem 3.1(i) in [18] implies that if the observations satisfy $X_i = \mu + Y_i$ with the $Y_i$ satisfying Assumption 2.1 and if (3.4) holds, then

\[ s_n \stackrel{P}{\to} \sigma, \tag{3.5} \]

where $\sigma$ is the asymptotic standard deviation appearing in condition (2.3). □
Theorem 3.2. Suppose the observations $X_1, \ldots, X_n$ follow model (2.1). If Assumption 2.1, (3.4) and (2.18)-(2.21) hold, then $T_n \stackrel{P}{\to} \infty$. [Assumption (2.19) implies that $\Delta \ne 0$.]

Theorem 3.2 is proved in Appendix D.

Theorem 3.3. Suppose the sequence $\{X_k\}$ satisfies Assumption 2.2. If $q(n)/n \to 0$, then

\[ \left( \frac{q(n)}{n} \right)^{H - 1/2} T_n \stackrel{d}{\to} \sup_{0 \le t \le 1} |W_H(t) - tW_H(1)|. \tag{3.6} \]

[Convergence (3.6) implies $T_n \stackrel{P}{\to} \infty$.]

Proof. By Theorem 3.1 in [18], if (2.9), (2.10) and (3.4) hold, then

\[ q(n)^{1 - 2H} s_n^2 \stackrel{P}{\to} c_H^2 = \frac{c_0}{H(2H - 1)}. \tag{3.7} \]

The constants $c_H$ and $c_0$ in (3.7) are the same as, respectively, in (2.7) and (2.9). Theorem 3.3 now follows immediately from (2.7) and (3.7). □
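The normalization in (3.6) can be checked by a short calculation. The following is a heuristic sketch only: it assumes the continuous mapping theorem can be applied to (2.7) and that the two limits can be combined, which is what the word "immediately" in the proof relies on.

```latex
\begin{align*}
\left(\frac{q(n)}{n}\right)^{H-1/2} T_n
  &= \frac{q(n)^{H-1/2}\, n^{1/2-H}}{n^{1/2}\, s_n}
     \max_{1\le k\le n}\left|\sum_{1\le i\le k}(X_i-\mu)-\frac{k}{n}\sum_{1\le i\le n}(X_i-\mu)\right| \\
  &= \frac{n^{-H}\max_{1\le k\le n}\left|\sum_{1\le i\le k}(X_i-\mu)-\frac{k}{n}\sum_{1\le i\le n}(X_i-\mu)\right|}
          {q(n)^{1/2-H}\, s_n}, \\
\text{numerator} &\stackrel{d}{\to} c_H \sup_{0\le t\le 1}\left|W_H(t)-tW_H(1)\right|
  \quad\text{by (2.7)}, \\
\text{denominator} &\stackrel{P}{\to} c_H
  \quad\text{by (3.7)}.
\end{align*}
```

The constant $c_H$ cancels, which gives (3.6). Since $H > 1/2$ and $q(n)/n \to 0$, the factor $(q(n)/n)^{H-1/2}$ tends to zero, so $T_n$ must tend to infinity in probability.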



In order to focus on essential arguments, we considered in Section 2 a 
simple testing problem. In some applications, however, the presence of more 
than one change-point may be suspected. Our test can be extended to a 
multistage testing procedure which is applicable in situations when there 
is an upper bound on the number of possible change-points. The latter as- 
sumption is often used in change-point analysis; see, for example, [44] and 
references therein. For example, in time series of daily returns on market 
indices over a period of ten years, or in temperature series over periods of 
300 years, one suspects at most two or three change-points; see Section 3 
for a data example. For such time series, the maximum number of change 
points in mean can typically be readily established by a visual inspection of 
a time series plot. 

Before describing the procedure, we must introduce additional notation. Denote by $T(l, m)$ the CUSUM statistic $T_n$ (3.1) computed from the observations $X_{l+1}, \ldots, X_m$ and by $\hat k(l, m)$ the change-point estimator (2.11) computed from the same observations. Let $B^{(u)}$, $u = 1, 2, \ldots$, be independent Brownian bridges. Define the critical value $c(u)$ by

\[ P\left( \max\left\{ \sup_{0 \le t \le 1} |B^{(1)}(t)|, \ldots, \sup_{0 \le t \le 1} |B^{(u)}(t)| \right\} > c(u) \right) = \alpha. \]

LONG-RANGE DEPENDENCE AND CHANGES IN MEAN 






As mentioned earlier, the distribution of $\sup_{0 \le t \le 1} |B^{(1)}(t)|$ is known and is tabulated in [27], so $c(u)$ can be found directly from

\[ P\left( \sup_{0 \le t \le 1} |B^{(1)}(t)| < c(u) \right) = (1 - \alpha)^{1/u}. \]
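As an alternative to tables, $c(u)$ can be computed numerically. The sketch below assumes the classical series $P(\sup_{0 \le t \le 1} |B(t)| \le x) = 1 - 2\sum_{k \ge 1} (-1)^{k+1} e^{-2k^2x^2}$ for the supremum of the absolute value of a Brownian bridge and solves the displayed equation by bisection; the function names, the truncation at 100 terms and the bracketing interval are illustrative assumptions.

```python
import math


def sup_bridge_cdf(x, terms=100):
    """P(sup |B(t)| <= x) via the truncated alternating (Kolmogorov) series."""
    if x <= 0:
        return 0.0
    tail = sum((-1) ** (k + 1) * math.exp(-2.0 * k * k * x * x)
               for k in range(1, terms + 1))
    return 1.0 - 2.0 * tail


def critical_value(u, alpha, lo=0.2, hi=5.0, tol=1e-10):
    """Solve P(sup |B| < c) = (1 - alpha)^(1/u) for c by bisection."""
    target = (1.0 - alpha) ** (1.0 / u)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sup_bridge_cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, `critical_value(1, 0.05)` is approximately 1.358, and for $u = 2$ the 10% and 5% values are roughly 1.35 and 1.48.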

The procedure we recommend is based on the binary segmentation method of [43]. To focus attention, suppose there can be at most two changes in mean, that is, we want to determine if the observations are weakly dependent with none, one or two changes in the mean or whether they contain a long-range dependent stretch of data. If $T_n = T(0, n) < c(1)$, the observations are weakly dependent. If $T_n > c(1)$, we compute $\hat k_1 := \hat k(0, n)$ and $M_2 = \max[T(0, \hat k_1), T(\hat k_1, n)]$. If $M_2 < c(2)$, the observations are weakly dependent with one change-point. If $M_2 > c(2)$, we compare $T(0, \hat k_1)$ and $T(\hat k_1, n)$. Suppose that $T(0, \hat k_1) < T(\hat k_1, n)$. We then compute $\hat k_2 = \hat k(\hat k_1, n)$ and

\[ M_3 = \max[T(0, \hat k_1), T(\hat k_1, \hat k_2), T(\hat k_2, n)]. \]

Extending Theorem 2.1 to the case of exactly two changes, we have

\[ M_3 \stackrel{d}{\to} \max\left\{ \sup_{0 \le t \le 1} |B^{(1)}(t)|, \ \sup_{0 \le t \le 1} |B^{(2)}(t)|, \ \sup_{0 \le t \le 1} |B^{(3)}(t)| \right\}. \]

Thus, if $M_3 < c(3)$, the observations are weakly dependent with two change-points. If $M_3 > c(3)$, the observations contain a long-range dependent stretch of data.
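The decision rule for at most two changes can be summarized in a few lines of Python. This is a sketch under stated assumptions rather than a definitive implementation: `stat(l, m)` and `split(l, m)` are hypothetical callables returning $T(l, m)$ and $\hat k(l, m)$, `crit[u]` holds $c(u)$, ties are treated as non-rejections, and the branch $T(0, \hat k_1) \ge T(\hat k_1, n)$, which the text does not spell out, is handled symmetrically by splitting the left segment.

```python
def classify(n, stat, split, crit):
    """Binary-segmentation decision for at most two changes in mean.

    stat(l, m)  -> CUSUM statistic T(l, m) from X_{l+1}, ..., X_m
    split(l, m) -> change-point estimate k(l, m) from the same observations
    crit[u]     -> critical value c(u), u = 1, 2, 3
    Returns a (label, change_points) pair.
    """
    if stat(0, n) <= crit[1]:
        return "weakly dependent, no change", []

    k1 = split(0, n)
    left, right = stat(0, k1), stat(k1, n)
    if max(left, right) <= crit[2]:
        return "weakly dependent, one change", [k1]

    # Split the segment with the larger statistic (symmetric case assumed).
    if left < right:
        cuts = [k1, split(k1, n)]
    else:
        cuts = [split(0, k1), k1]

    bounds = [0] + cuts + [n]
    m3 = max(stat(a, b) for a, b in zip(bounds, bounds[1:]))
    if m3 <= crit[3]:
        return "weakly dependent, two changes", cuts
    return "long-range dependent stretch", cuts
```

In practice `stat` and `split` would be built from the CUSUM statistic (3.1) and the estimator (2.11) applied to the corresponding subsample.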

Before concluding this section with a data example, we list several time 
series models which satisfy Assumptions 2.1 or 2.2. References to the proofs 
can be found in [16]. 

Example 3.1. The linear process

\[ X_k = \sum_j a_j \varepsilon_{k-j}, \tag{3.8} \]

where the $\varepsilon_j$ are independent identically distributed random variables with finite fourth moment and zero mean, satisfies Assumption 2.1 if $\sum_j |a_j| < \infty$. In particular, ARMA processes whose autoregressive polynomial has no zeros on the unit circle satisfy Assumption 2.1.

If, on the other hand, $a_j \sim cj^{d-1}$ for some $0 < d < 1/2$, then the $X_k$ in (3.8) satisfy Assumption 2.2 with $H = d + 1/2$. In particular, fractional ARIMA processes whose autoregressive polynomial has no zeros on the unit circle satisfy Assumption 2.2.

Example 3.2. Consider the process $\{\eta_k\}$ satisfying

\[ \eta_k = \rho_k \xi_k, \qquad \rho_k = a + \sum_{j \ge 1} c_j \eta_{k-j}, \tag{3.9} \]

where $a > 0$, $c_j \ge 0$ and the $\xi_k$ are independent identically distributed nonnegative random variables with finite fourth moment. The $\eta_k$ should be viewed as the squares of an ARCH process. If

\[ [E\xi_k^4]^{1/4} \sum_{j \ge 1} c_j < 1, \tag{3.10} \]

then the sequence $Y_k = \eta_k - E\eta_k$ satisfies Assumption 2.1.

As a more specific example, consider $Y_k = r_k^2 - Er_k^2$, where the $r_k$ follow a GARCH($p$, $q$) model,

\[ r_k = \sigma_k \varepsilon_k, \qquad \sigma_k^2 = \omega + \sum_{1 \le i \le p} \alpha_i r_{k-i}^2 + \sum_{1 \le j \le q} \beta_j \sigma_{k-j}^2. \tag{3.11} \]

Then, under regularity conditions derived in [7], the $c_j$ are defined by

\[ \sum_{j \ge 1} c_j z^j = \frac{\sum_{1 \le i \le p} \alpha_i z^i}{1 - \sum_{1 \le j \le q} \beta_j z^j}, \qquad |z| \le 1, \]

and $\rho_k = \sigma_k^2$, $\xi_k = \varepsilon_k^2$.

Example 3.3. The $r_k$ are said to follow a LARCH (Linear ARCH) model if

\[ r_k = \sigma_k \varepsilon_k, \qquad \sigma_k = a + \sum_{j \ge 1} b_j r_{k-j}, \tag{3.12} \]

where $a \ne 0$, the $b_j$ are real coefficients (not necessarily nonnegative) and the $\varepsilon_k$ are independent identically distributed with zero mean and finite fourth moment. If $b_j \sim cj^{d-1}$ for some $0 < d < 1/2$ and

\[ L [E\varepsilon_k^4]^{1/2} \sum_{j \ge 1} b_j^2 < 1, \]

where $L = 7$ if the $\varepsilon_k$ are Gaussian and $L = 11$ in general, then the $Y_k = r_k^2$ satisfy conditions (2.7) and (2.9) of Assumption 2.2. Conditions for (2.10) to hold have not been established yet. The LARCH model was studied by Robinson [39], Giraitis et al. [17, 19, 20] and Berkes and Horváth [6], among others.



We conclude this section with an illustration of how our procedure can be applied in practice. Figure 1 shows daily returns of the Dow Jones Industrial Average from January 1, 1992 to December 31, 1999 and a simulated LARCH process with $H = 0.85$ of the same length ($n = 2021$). The corresponding columns show the sample autocorrelation functions and smoothed periodograms of the squares of the two series in the top row. The volatility (variance) of the Dow Jones series appears to have a change-point somewhere in the middle of the series, but given that we observe only a finite realization, this change-point might be spurious and the observed change in variance might be explained as a persistent increase in volatility characteristic of a long memory process. That this might well be the case is reinforced by the examination of the plot of the simulated LARCH series which exhibits markedly higher variability in the first 1/3 of the realization, even though the plot shows a realization of a strictly and fourth-order stationary process. The left column in Figure 1 shows that the autocorrelation function of the squared Dow Jones returns does not decay to zero in a fashion typical of a short memory process and the smoothed periodogram (on a log-log scale) exhibits a clear positive slope. In fact, a periodogram-based semiparametric estimate of $H$ based on the automatic bandwidth selection procedure proposed by Lobato and Robinson [32] yields the estimate $\hat H = 0.842991$.

FIG. 1. Daily returns of the Dow Jones Industrial Average and a simulated LARCH process with $H = 0.85$ together with the autocorrelation functions and smoothed periodograms at low frequencies of the squared observations.

For the Dow Jones returns, we therefore wish to test the null hypothesis of exactly one change in the variance of the observations against the alternative that the squared observations are a realization of a long-range dependent process. Assuming that the mean of the returns is zero (we subtracted the sample mean of 0.05829482 before conducting further analysis), this testing problem is thus identical with the basic testing problem formulated in Section 2, with the $X_i$ being equal to the squared returns.

In order to perform the test, we need to choose the bandwidth function $q(\cdot)$. We performed our calculations in Splus and used the function acf to obtain sample autocovariances. By default, this function returns the first $10\log_{10}(n)$ sample autocovariances for a time series of length $n$. We found, however, that, for the nonlinear return data, more autocovariances must be used to capture the dependence structure, so we increased the maximum lag up to which the autocovariances are computed by 50%. Thus, in the following, we report the results based on

\[ q(n) = 15\log_{10}(n). \]

The value of $M_n$ is 1.341153, which lies below the 10% asymptotic critical value of 1.36 (the 5% and 1% critical values are, resp., 1.48 and 1.72). We are thus unable to reject the null hypothesis of a change-point in the level of the squared returns.

To validate the above conclusion, we need to assess the empirical size and power of the test. To assess the size, we divided the data into two parts, before and after the estimated change-point $\hat k = 1061$, and fitted the GARCH(1, 1) model to each stretch of data. We obtained the following parameters [see (3.11)]:

                   $\omega$      $\alpha_1$    $\beta_1$
Before $\hat k$    0.02461474    0.06404848    0.87864088
After $\hat k$     0.09540076    0.09734341    0.83945713

Using $Er_k^2 = \omega/(1 - \alpha_1 - \beta_1)$, the implied change in the variance (level of the $r_k^2$) is 1.080022. In fact, variances implied by the GARCH(1, 1) models before and after $\hat k$ are very close to the corresponding sample variances whose difference is 1.040886.
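The implied change in variance can be reproduced from the fitted parameters with the formula $Er_k^2 = \omega/(1 - \alpha_1 - \beta_1)$. The short Python check below uses only the parameter values reported above; the function name is an illustrative assumption.

```python
def garch11_variance(omega, alpha1, beta1):
    """Unconditional variance omega / (1 - alpha1 - beta1) of a GARCH(1, 1) model."""
    persistence = alpha1 + beta1
    if persistence >= 1.0:
        raise ValueError("alpha1 + beta1 must be below 1 for a finite variance")
    return omega / (1.0 - persistence)


# Fitted parameters before and after the estimated change-point.
var_before = garch11_variance(0.02461474, 0.06404848, 0.87864088)
var_after = garch11_variance(0.09540076, 0.09734341, 0.83945713)
implied_change = var_after - var_before
```

The two implied variances are roughly 0.43 and 1.51, and their difference agrees with the reported value 1.080022 up to rounding.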

We simulated one thousand replications of the above change-point model and on each of them we computed the value of $M_n$. Table 1 reports the percentage of rejections of the null hypothesis. At the nominal level of 10%, the percentage of rejections is slightly over 10%, suggesting that accepting the null hypothesis based on the value of $M_n = 1.34115$ was not due to type II error.

To assess the power, we simulated one thousand replications of the LARCH process (3.12) with $d = 0.35$ and the $b_j$ computed according to the recursion $b_j = [b_{j-1}(j + d)]/(j + 1)$, with $b_0 = 0.25$ and $a = 0.03$. These parameter values ensure that the process is fourth-order stationary and were chosen by experimentation to make the realizations similar to the Dow Jones returns, with a typical realization shown in Figure 1. Table 1 shows that the test is able to detect the alternative at the nominal 10% level with probability of over 30%. For this particular alternative, the power is not very high. This can be explained by the fact that the realizations of a LARCH process with the parameters chosen above and for the sample size of $n = 2021$ often exhibit two periods of different variability which can be separated by the change-point estimator $\hat k$. The intensity of long-range dependence in each of the two subsamples is "underestimated," yielding small values of $M_n$. However, even though the alternative is "very close" to the null, the test has nontrivial power.

The above illustration is not meant as a guide for practitioners, but merely points out the potential of the test.
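For readers who wish to reproduce the LARCH coefficients used in the power study, the recursion $b_j = [b_{j-1}(j + d)]/(j + 1)$ can be coded directly. The sketch below is illustrative: the function name and the truncation point `m` are assumptions, and the simulation of the process itself is not shown.

```python
def larch_coefficients(d, b0, m):
    """Return [b_0, b_1, ..., b_m] from b_j = b_{j-1} * (j + d) / (j + 1)."""
    b = [b0]
    for j in range(1, m + 1):
        b.append(b[-1] * (j + d) / (j + 1))
    return b


# Values used in the power study: d = 0.35 and b_0 = 0.25.
coeffs = larch_coefficients(0.35, 0.25, 2000)
# The ratio b_{2j} / b_j approaches 2^(d-1), reflecting the decay b_j ~ c j^(d-1).
decay_ratio = coeffs[2000] / coeffs[1000]
```

The slow hyperbolic decay of the $b_j$ is what produces long-range dependence in the squares $r_k^2$, in line with Example 3.3.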



APPENDIX A

Almost sure convergence of the Bartlett estimator. For ease of reference, we present here the result on the almost sure asymptotics for the estimator $s_n^2$, which we appeal to in the following. Its proof is given in [8].

Theorem A.1. Suppose $\{Y_k\}$ is a fourth-order stationary sequence with $EY_i = 0$ and $\gamma_j = \mathrm{Cov}(Y_0, Y_j)$. Consider the variance estimator

\[ s_n^2 = \hat\gamma_0 + 2 \sum_{1 \le j \le q(n)} \omega_j(q(n)) \hat\gamma_j, \tag{A.1} \]

where $\hat\gamma_j$ are the sample autocovariances and $\omega_j(q)$ are the Bartlett weights defined respectively by (3.3) and (2.14).

Suppose the sequence $q(n)$ is nondecreasing and

\[ \sup_{k \ge 0} \frac{q(2^{k+1})}{q(2^k)} < \infty. \tag{A.2} \]

(i) Suppose, in addition, that conditions (2.4) and (2.5) hold and

\[ q(n) \to \infty \quad \text{and} \quad q(n)(\log n)^4 = O(n). \tag{A.3} \]

Then

\[ s_n^2 \to \sigma^2 := \sum_{j = -\infty}^{\infty} \gamma_j \quad \text{a.s.} \tag{A.4} \]

(ii) Assume

\[ \tfrac{1}{2} < H < 1 \tag{A.5} \]

and

\[ \gamma_k \sim c_0 k^{2H-2} \tag{A.6} \]

for some $c_0 > 0$. Assume also that

\[ q(n) \to \infty \quad \text{and} \quad q(n) = O\left( n (\log n)^{-7/(4 - 4H)} \right) \tag{A.7} \]

and

\[ \sup_{|h| \le q(n)} \sum_{-n < r,s < n} |\kappa(h, r, s)| = O(n^{2H-1}). \tag{A.8} \]

Then

\[ [q(n)]^{1 - 2H} s_n^2 \to c_H^2 \quad \text{a.s.}, \qquad c_H^2 = \frac{c_0}{H(2H - 1)}. \tag{A.9} \]

Table 1
Empirical size and power of the asymptotic test based on the statistic $M_n$

Nominal level (in %)    10.0    5.0    1.0
Empirical size          13.4    6.5    0.8
Empirical power         32.5    20.0   5.0

Remark A.1. By the fourth-order stationarity of the $X_i$, all bounds in the proof of Theorem A.1 remain valid if the random variables $X_1, \ldots, X_n$ are replaced by $X_{k+1}, \ldots, X_n$, $n$ is replaced by $n - k$ and $q(n)$ is replaced by $q(n - k)$. Therefore, on denoting $\bar X_k = \frac{1}{n - k} \sum_{k < i \le n} X_i$ and

\[ s_{k,n}^2 = \frac{1}{n - k} \sum_{k < i \le n} (X_i - \bar X_k)^2 + 2 \sum_{1 \le j \le q(n - k)} \omega_j(q(n - k)) \frac{1}{n - k} \sum_{k < i \le n - j} (X_i - \bar X_k)(X_{i+j} - \bar X_k), \tag{A.10} \]

under the assumptions of part (i) of Theorem A.1,

\[ s_{k,n}^2 \to \sigma^2 \quad \text{a.s. as } n - k \to \infty, \tag{A.11} \]

and under the assumptions of part (ii) of Theorem A.1,

\[ [q(n - k)]^{1 - 2H} s_{k,n}^2 \to c_H^2 \quad \text{a.s. as } n - k \to \infty. \tag{A.12} \]

Relations (A.11) and (A.12) are used, respectively, in the proofs of Lemmas B.3 and C.2.
APPENDIX B

Proof of Theorem 2.1. Theorem 2.1 will follow immediately from Lemmas B.1, B.2 and B.3, which are stated and proved below.

In this section we assume that the observations follow the change-point model (2.1) and that (2.18) holds.

We will extensively use the relation

(B.1)  $\cdots = o_P(1),$

which follows from assumptions (2.19) and (2.20).
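Lemma B.1. If (2.19) and (2.20) hold, then

(B.2)
$$n^{-1/2} \max_{1 \le k \le \hat k} \Bigl| \sum_{1 \le i \le k} X_i - \frac{k}{\hat k} \sum_{1 \le i \le \hat k} X_i \Bigr| = n^{-1/2} \max_{1 \le k \le \hat k} \Bigl| \sum_{1 \le i \le k} Y_i - \frac{k}{\hat k} \sum_{1 \le i \le \hat k} Y_i \Bigr| + o_P(1)$$

and

(B.3)
$$n^{-1/2} \max_{\hat k < k \le n} \Bigl| \sum_{\hat k < i \le k} X_i - \frac{k - \hat k}{n - \hat k} \sum_{\hat k < i \le n} X_i \Bigr| = n^{-1/2} \max_{\hat k < k \le n} \Bigl| \sum_{\hat k < i \le k} Y_i - \frac{k - \hat k}{n - \hat k} \sum_{\hat k < i \le n} Y_i \Bigr| + o_P(1).$$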



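Proof. We can assume that $\mu = 0$. Since the verification of (B.3) is very similar to that of (B.2), we present only the proof of (B.2).

If $\hat k \le k^*$,

$$\sum_{1 \le i \le k} X_i - \frac{k}{\hat k} \sum_{1 \le i \le \hat k} X_i = \sum_{1 \le i \le k} Y_i - \frac{k}{\hat k} \sum_{1 \le i \le \hat k} Y_i$$

for all $1 \le k \le \hat k$, so (B.2) holds trivially. If $k^* < \hat k$, then

$$\frac{k}{\hat k} \sum_{1 \le i \le \hat k} X_i = \frac{k}{\hat k} \sum_{1 \le i \le \hat k} Y_i + \frac{k}{\hat k}\, \Delta\, (\hat k - k^*)$$

and

$$\sum_{1 \le i \le k} X_i = \begin{cases} \sum_{1 \le i \le k} Y_i, & \text{if } 1 \le k \le k^*, \\ \sum_{1 \le i \le k} Y_i + (k - k^*) \Delta, & \text{if } k^* < k \le \hat k. \end{cases}$$

Therefore,

$$\Bigl| n^{-1/2} \max_{1 \le k \le \hat k} \Bigl| \sum_{1 \le i \le k} X_i - \frac{k}{\hat k} \sum_{1 \le i \le \hat k} X_i \Bigr| - n^{-1/2} \max_{1 \le k \le \hat k} \Bigl| \sum_{1 \le i \le k} Y_i - \frac{k}{\hat k} \sum_{1 \le i \le \hat k} Y_i \Bigr| \Bigr| \le 2 n^{-1/2} |\Delta|\, |\hat k - k^*| = \frac{2 \Delta^2 |\hat k - k^*|}{|\Delta|\, n^{1/2}},$$

so (B.2) follows from assumptions (2.20) and (2.19). □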
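The deterministic bound used in the proof of Lemma B.1 can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes $\mu = 0$, places the shift $\Delta$ after $k^*$ as in the proof, and treats $\hat k$ as a given integer rather than an estimator. The function names are hypothetical.

```python
import random


def centered_cusum_max(z, k_hat):
    """max over 1 <= k <= k_hat of | sum_{i<=k} z_i - (k/k_hat) * sum_{i<=k_hat} z_i |."""
    total = sum(z[:k_hat])
    partial, best = 0.0, 0.0
    for k in range(1, k_hat + 1):
        partial += z[k - 1]
        best = max(best, abs(partial - k / k_hat * total))
    return best


def lemma_b1_gap(y, k_star, delta, k_hat):
    """Absolute difference between the X-based and Y-based maxima (unnormalised)."""
    # X_i = Y_i for i <= k_star and X_i = Y_i + delta for i > k_star (1-based i).
    x = [y[i] + (delta if i >= k_star else 0.0) for i in range(len(y))]
    return abs(centered_cusum_max(x, k_hat) - centered_cusum_max(y, k_hat))


if __name__ == "__main__":
    random.seed(1)
    y = [random.gauss(0.0, 1.0) for _ in range(500)]
    k_star, delta = 250, 0.8
    for k_hat in (200, 250, 260, 320):
        gap = lemma_b1_gap(y, k_star, delta, k_hat)
        bound = 2.0 * abs(delta) * max(k_hat - k_star, 0)
        print(k_hat, gap, bound, gap <= bound + 1e-9)
```

When $\hat k \le k^*$ the two maxima coincide, and otherwise their difference is at most $2|\Delta|\,|\hat k - k^*|$, which is the quantity that assumptions (2.19) and (2.20) make negligible after division by $n^{1/2}$.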

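Lemma B.2. If (2.3) and (B.1) hold, then the sequence of random vectors

$$\Biggl( \frac{1}{\sigma \hat k^{1/2}} \max_{1 \le k \le \hat k} \Bigl| \sum_{1 \le i \le k} Y_i - \frac{k}{\hat k} \sum_{1 \le i \le \hat k} Y_i \Bigr|,\ \frac{1}{\sigma (n - \hat k)^{1/2}} \max_{\hat k < k \le n} \Bigl| \sum_{\hat k < i \le k} Y_i - \frac{k - \hat k}{n - \hat k} \sum_{\hat k < i \le n} Y_i \Bigr| \Biggr)$$

converges in distribution to the random vector

$$\Bigl( \sup_{0 \le t \le 1} |B^{(1)}(t)|,\ \sup_{0 \le t \le 1} |B^{(2)}(t)| \Bigr),$$

where $B^{(1)}$ and $B^{(2)}$ are independent Brownian bridges.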

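Proof. By (2.6),

(B.4)
$$\max_{1 \le k \le \hat k} \Bigl| \Bigl[ \sum_{1 \le i \le k} Y_i - \frac{k}{\hat k} \sum_{1 \le i \le \hat k} Y_i \Bigr] - \sigma n^{1/2} \Bigl[ W_n\Bigl(\frac{k}{n}\Bigr) - \frac{k}{\hat k} W_n\Bigl(\frac{\hat k}{n}\Bigr) \Bigr] \Bigr| = o_P(n^{1/2}).$$

Using (B.1) and the continuity of the Wiener process, we get

(B.5)  $|W_n(\hat k/n) - W_n(\theta)| = o_P(1).$

Hence,

(B.6)
$$\max_{1 \le k \le \hat k} \Bigl| W_n\Bigl(\frac{k}{n}\Bigr) - \frac{k}{\hat k} W_n\Bigl(\frac{\hat k}{n}\Bigr) \Bigr| = \sup_{0 \le t \le \theta} \Bigl| W_n(t) - \frac{t}{\theta} W_n(\theta) \Bigr| + o_P(1)$$

and consequently, by (B.4),

(B.7)
$$\frac{1}{\sigma \hat k^{1/2}} \max_{1 \le k \le \hat k} \Bigl| \sum_{1 \le i \le k} Y_i - \frac{k}{\hat k} \sum_{1 \le i \le \hat k} Y_i \Bigr| = \frac{1}{\theta^{1/2}} \sup_{0 \le t \le \theta} \Bigl| W_n(t) - \frac{t}{\theta} W_n(\theta) \Bigr| + o_P(1).$$

Similar arguments give

(B.8)
$$\frac{1}{\sigma (n - \hat k)^{1/2}} \max_{\hat k < k \le n} \Bigl| \sum_{\hat k < i \le k} Y_i - \frac{k - \hat k}{n - \hat k} \sum_{\hat k < i \le n} Y_i \Bigr| = \frac{1}{(1-\theta)^{1/2}} \sup_{\theta \le t \le 1} \Bigl| (W_n(t) - W_n(\theta)) - \frac{t - \theta}{1 - \theta} (W_n(1) - W_n(\theta)) \Bigr| + o_P(1).$$

Since $\theta^{-1/2} W_n(\theta t)$, $0 \le t \le 1$, is a Wiener process,

(B.9)
$$\frac{1}{\theta^{1/2}} \sup_{0 \le t \le \theta} \Bigl| W_n(t) - \frac{t}{\theta} W_n(\theta) \Bigr| \overset{d}{=} \sup_{0 \le t \le 1} |B^{(1)}(t)|,$$

where $B^{(1)}$ is a Brownian bridge. Similarly, there is a Wiener process $W(t)$, $0 \le t \le 1$, such that

(B.10)
$$\begin{aligned}
\frac{1}{(1-\theta)^{1/2}} \sup_{\theta \le t \le 1} \Bigl| (W_n(t) - W_n(\theta)) - \frac{t - \theta}{1 - \theta} (W_n(1) - W_n(\theta)) \Bigr|
&\overset{d}{=} \frac{1}{(1-\theta)^{1/2}} \sup_{\theta \le t \le 1} \Bigl| W(t - \theta) - \frac{t - \theta}{1 - \theta} W(1 - \theta) \Bigr| \\
&= \frac{1}{(1-\theta)^{1/2}} \sup_{0 \le t \le 1 - \theta} \Bigl| W(t) - \frac{t}{1 - \theta} W(1 - \theta) \Bigr| \\
&\overset{d}{=} \sup_{0 \le t \le 1} |B^{(2)}(t)|,
\end{aligned}$$

where $B^{(2)}$ is another Brownian bridge. The claim thus follows by combining (B.7), (B.8) and (B.9), (B.10) and using the independence of the increments of a Wiener process. □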
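The limit in Lemma B.2 is the maximum of two independent copies of $\sup_{0 \le t \le 1} |B(t)|$, whose distribution function is the classical Kolmogorov series $K(x) = 1 - 2 \sum_{k \ge 1} (-1)^{k-1} e^{-2k^2x^2}$. By independence, the maximum has distribution function $K(x)^2$. The sketch below evaluates this numerically and finds critical values by bisection. It is an illustrative calculation rather than the procedure used in the paper, and the function names are hypothetical.

```python
import math


def kolmogorov_cdf(x, terms=100):
    """P(sup_{0<=t<=1} |B(t)| <= x) for a Brownian bridge B."""
    if x < 0.2:
        # Numerical shortcut: the true value is below 1e-12 on this range,
        # and the alternating series converges slowly for very small x.
        return 0.0
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * x * x)
            for k in range(1, terms + 1))
    return 1.0 - 2.0 * s


def max_of_two_cdf(x):
    """Distribution function of the maximum of two independent bridge suprema."""
    return kolmogorov_cdf(x) ** 2


def critical_value(alpha, lo=0.0, hi=5.0, tol=1e-10):
    """Smallest x with max_of_two_cdf(x) >= 1 - alpha, found by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if max_of_two_cdf(mid) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for alpha in (0.10, 0.05, 0.01):
        print(alpha, critical_value(alpha))
```

Because the distribution function of the maximum is the square of $K$, the level-$\alpha$ critical value for the maximum equals the level-$(1 - \sqrt{1 - \alpha})$ critical value of a single Kolmogorov variable.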

Lemma B.3. Suppose Assumption 2.1, (2.19), (2.21), (2.22), (2.23) and (B.1) hold. Then

$$s_{n,1} \xrightarrow{P} \sigma \quad \text{and} \quad s_{n,2} \xrightarrow{P} \sigma.$$



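Proof. Following the proof of Proposition D.1, we get

$$s_{n,1}^2 = \sum_{1 \le m \le 5} \Bigl[ \hat\gamma_{0m,1} + 2 \sum_{1 \le j \le q(\hat k)} \omega_j(q(\hat k))\, \hat\gamma_{jm,1} \Bigr] =: \sum_{1 \le m \le 5} s_{nm,1}^2,$$

where

$$\hat\gamma_{j1,1} = \frac{1}{\hat k} \sum_{1 \le i \le \hat k - j} (Y_i - \bar Y_{\hat k})(Y_{i+j} - \bar Y_{\hat k}),$$

$$\hat\gamma_{j2,1} = \frac{k^* - j}{\hat k} \Bigl( \frac{\hat k - k^*}{\hat k} \Delta \Bigr)^2 - \frac{j}{\hat k}\, \frac{\hat k - k^*}{\hat k}\, \frac{k^*}{\hat k}\, \Delta^2 + \frac{\hat k - j - k^*}{\hat k} \Bigl( \frac{k^*}{\hat k} \Delta \Bigr)^2,$$

$$\hat\gamma_{j3,1} = -\frac{1}{\hat k} \sum_{1 \le i \le k^* - j} \bigl[ (Y_i - \bar Y_{\hat k}) + (Y_{i+j} - \bar Y_{\hat k}) \bigr] \frac{\hat k - k^*}{\hat k} \Delta,$$

$$\hat\gamma_{j4,1} = \frac{1}{\hat k} \sum_{k^* - j < i \le k^*} \Bigl[ (Y_i - \bar Y_{\hat k}) \frac{k^*}{\hat k} - (Y_{i+j} - \bar Y_{\hat k}) \frac{\hat k - k^*}{\hat k} \Bigr] \Delta$$

and

$$\hat\gamma_{j5,1} = \frac{1}{\hat k} \sum_{k^* < i \le \hat k - j} \bigl[ (Y_i - \bar Y_{\hat k}) + (Y_{i+j} - \bar Y_{\hat k}) \bigr] \frac{k^*}{\hat k} \Delta.$$

Since $s_n^2 \to \sigma^2$ a.s. by part (i) of Theorem A.1 and $\hat k \xrightarrow{P} \infty$ by (B.1), Theorem 7.1.1(c) on page 252 of [12] yields that

(B.11)  $s_{n1,1}^2 \xrightarrow{P} \sigma^2.$

Next we show that

(B.12)  $s_{nm,1}^2 = o_P(1)$ for $m = 2, 3, 4, 5$.

As we have seen in the proof of Proposition D.1,

$$s_{nm,1}^2 = O_P\bigl( \hat k^{-1/2} q(\hat k) |\Delta| \bigr) = O_P\Bigl( \Bigl( \frac{n}{\hat k} \Bigr)^{1/2} \frac{q(\hat k) \Delta^2}{n^{1/2} |\Delta|} \Bigr) = o_P(1),$$

proving (B.12).

To prove $s_{n,2}^2 \xrightarrow{P} \sigma^2$, we can apply the same argument, upon observing that by Remark A.1, for all $0 < r < 1$, we have

$$\max_{rn < k < n} |s_{k,n}^2 - \sigma^2| \to 0 \quad \text{a.s.},$$

where $s_{k,n}^2$ is defined in (A.10). □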
APPENDIX C

Proof of Theorem 2.2. Theorem 2.2 will follow directly from Lemmas C.1 and C.2 below. We can assume that $\mu = 0$.

Let

$$Z_{n1}(t) = \frac{1}{n^H} \max_{1 \le k \le nt} \Bigl| \sum_{1 \le i \le k} X_i - \frac{k}{nt} \sum_{1 \le i \le nt} X_i \Bigr|$$

and

$$Z_{n2}(t) = \frac{1}{n^H} \max_{nt < k \le n} \Bigl| \sum_{nt < i \le k} X_i - \frac{k - nt}{n - nt} \sum_{nt < i \le n} X_i \Bigr|.$$

Similarly, let

$$Z_1(t) = c_H \sup_{0 \le s \le t} \Bigl| W_H(s) - \frac{s}{t} W_H(t) \Bigr|$$

and

$$Z_2(t) = c_H \sup_{t \le s \le 1} \Bigl| (W_H(s) - W_H(t)) - \frac{s - t}{1 - t} (W_H(1) - W_H(t)) \Bigr|,$$

where $W_H$ is defined in Assumption 2.2.



Lemma C.1. Suppose that (2.7) and (2.8) hold. Then

(C.1)  $(\hat k/n,\ Z_{n1}(t),\ Z_{n2}(t)) \xrightarrow{d} (\xi,\ Z_1(t),\ Z_2(t)),$

where $\xi$ is defined by (2.24). The vectors in (C.1) take values in $(0,1) \times D[0,1] \times D[0,1]$.

Proof. The vector $(\hat k/n, Z_{n1}(t), Z_{n2}(t))$ is a continuous mapping of $\{\sum_{1 \le k \le nt} X_k,\ 0 \le t \le 1\}$. The same mapping transforms $W_H(t)$ into $(\xi, Z_1(t), Z_2(t))$. Hence, the statement of the lemma follows from the continuous mapping theorem. □



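Lemma C.2. We assume that the conditions of Theorem 2.2 are satisfied. Then

(C.2)  $[q(\hat k)]^{1-2H} s_{n,1}^2 \xrightarrow{P} c_H^2$

and

(C.3)  $[q(n - \hat k)]^{1-2H} s_{n,2}^2 \xrightarrow{P} c_H^2.$

Proof. We first verify (C.2). By part (ii) of Theorem A.1, for any $0 < r < 1$,

$$\sup_{k > rn} \bigl| [q(k)]^{1-2H} s_k^2 - c_H^2 \bigr| \to 0 \quad \text{a.s.}$$

For any $0 < r < 1$ which is a continuity point of the distribution function of $\xi$ we have

$$\begin{aligned}
\limsup_{n \to \infty} P\bigl[ \bigl| [q(\hat k)]^{1-2H} s_{n,1}^2 - c_H^2 \bigr| > \varepsilon \bigr]
&\le \limsup_{n \to \infty} P[\hat k/n \le r] + \limsup_{n \to \infty} P\Bigl[ \sup_{k > rn} \bigl| [q(k)]^{1-2H} s_k^2 - c_H^2 \bigr| > \varepsilon \Bigr] \\
&= P(\xi \le r).
\end{aligned}$$

Since $P(\xi \le r) \to 0$ as $r \to 0$, (C.2) follows.

To prove (C.3), note that by (A.12),

$$\sup_{k < (1-r)n} \bigl| [q(n-k)]^{1-2H} s_{k,n}^2 - c_H^2 \bigr| \to 0 \quad \text{a.s.}$$

Relation (C.3) is then established using a limsup argument as above and the fact that $P(\xi > r) \to 0$ as $r \to 1$. □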

APPENDIX D 

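Proof of Theorem 3.2. Observe that by Assumption 2.1 and (2.18),

$$\begin{aligned}
\frac{1}{n^{1/2}} \max_{1 \le k \le n} \Bigl| \sum_{1 \le i \le k} X_i - \frac{k}{n} \sum_{1 \le i \le n} X_i \Bigr|
&\ge \frac{1}{n^{1/2}} \Bigl| \sum_{1 \le i \le k^*} X_i - \frac{k^*}{n} \sum_{1 \le i \le n} X_i \Bigr| \\
&= \frac{1}{n^{1/2}} \Bigl| \sum_{1 \le i \le k^*} Y_i - \frac{k^*}{n} \sum_{1 \le i \le n} Y_i - \frac{k^*(n - k^*)}{n} \Delta \Bigr| \\
&\ge \tfrac{1}{2}\, n^{1/2}\, \theta (1 - \theta) |\Delta| - O_P(1),
\end{aligned}$$

as $n \to \infty$. Hence, it suffices to show that

$$\frac{n^{1/2} |\Delta|}{s_n} \xrightarrow{P} \infty,$$

which, in view of (2.19), will follow if we show that $s_n = O_P(1)$, which is verified in the following proposition.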



Proposition D.1. Suppose model (2.1) is valid. Consider the estimator $s_n^2$ defined by (3.2). Suppose Assumption 2.1 and (3.4), (2.18)-(2.21) hold. Then

(D.1)  $s_n^2 = O_P(1).$



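Proof. Denoting $V_{ij} = (X_i - \bar X_n)(X_{i+j} - \bar X_n)$, observe that

$$V_{ij} = (Y_i - \bar Y_n)(Y_{i+j} - \bar Y_n) - (Y_i - \bar Y_n) \frac{n - k^*}{n} \Delta - (Y_{i+j} - \bar Y_n) \frac{n - k^*}{n} \Delta + \Bigl( \frac{n - k^*}{n} \Delta \Bigr)^2 \quad \text{if } 1 \le i < i + j \le k^*,$$

$$V_{ij} = (Y_i - \bar Y_n)(Y_{i+j} - \bar Y_n) + (Y_i - \bar Y_n) \frac{k^*}{n} \Delta - (Y_{i+j} - \bar Y_n) \frac{n - k^*}{n} \Delta - \frac{n - k^*}{n} \Delta\, \frac{k^*}{n} \Delta \quad \text{if } 1 \le i \le k^* < i + j,$$

$$V_{ij} = (Y_i - \bar Y_n)(Y_{i+j} - \bar Y_n) + (Y_i - \bar Y_n) \frac{k^*}{n} \Delta + (Y_{i+j} - \bar Y_n) \frac{k^*}{n} \Delta + \Bigl( \frac{k^*}{n} \Delta \Bigr)^2 \quad \text{if } k^* < i < i + j.$$

Therefore, for any $0 \le j \le q$,

$$\hat\gamma_j = \sum_{1 \le m \le 5} \hat\gamma_{jm},$$

where

$$\hat\gamma_{j1} = \frac{1}{n} \sum_{1 \le i \le n - j} (Y_i - \bar Y_n)(Y_{i+j} - \bar Y_n),$$

$$\hat\gamma_{j2} = \frac{k^* - j}{n} \Bigl( \frac{n - k^*}{n} \Bigr)^2 \Delta^2 - \frac{j}{n}\, \frac{n - k^*}{n}\, \frac{k^*}{n}\, \Delta^2 + \frac{n - j - k^*}{n} \Bigl( \frac{k^*}{n} \Bigr)^2 \Delta^2,$$

$$\hat\gamma_{j3} = -\frac{1}{n} \sum_{1 \le i \le k^* - j} \bigl[ (Y_i - \bar Y_n) + (Y_{i+j} - \bar Y_n) \bigr] \frac{n - k^*}{n} \Delta,$$

$$\hat\gamma_{j4} = \frac{1}{n} \sum_{k^* - j < i \le k^*} \Bigl[ (Y_i - \bar Y_n) \frac{k^*}{n} - (Y_{i+j} - \bar Y_n) \frac{n - k^*}{n} \Bigr] \Delta$$

and

$$\hat\gamma_{j5} = \frac{1}{n} \sum_{k^* < i \le n - j} \bigl[ (Y_i - \bar Y_n) + (Y_{i+j} - \bar Y_n) \bigr] \frac{k^*}{n} \Delta.$$

Consequently,

$$s_n^2 = \sum_{1 \le m \le 5} \Bigl[ \hat\gamma_{0m} + 2 \sum_{1 \le j \le q} \omega_j(q)\, \hat\gamma_{jm} \Bigr] =: \sum_{1 \le m \le 5} s_{nm}^2.$$

By (3.5),

(D.2)  $s_{n1}^2 \xrightarrow{P} \sigma^2.$

Hence, (D.1) will follow if we verify that

(D.3)  $s_{nm}^2 = O_P(1)$ for $m = 2, 3, 4, 5$.

In order to verify (D.3), we will often appeal to the two elementary relations

(D.4)  $2 \sum_{1 \le j \le q} \omega_j(q) = q$

and

(D.5)  $2 \sum_{1 \le j \le q} j\, \omega_j(q) = O(q^2).$

Relation (D.3) is easy to verify for $m = 3, 4, 5$. By (2.6) and (D.4),

$$s_{nm}^2 = O_P(n^{-1/2} q |\Delta|) = O_P\Bigl( \frac{q \Delta^2}{n^{1/2} |\Delta|} \Bigr) = o_P(1)$$

on account of (2.19) and (2.21). It thus remains to establish (D.3) for $m = 2$. Since there are three terms in the definition of $\hat\gamma_{j2}$, we may write $s_{n2}^2 = s_{n21}^2 + s_{n22}^2 + s_{n23}^2$, with one summand for each term. We have

(D.6)
$$\begin{aligned}
\frac{s_{n21}^2}{q \Delta^2}
&= \frac{1}{q} \Bigl[ \frac{k^*}{n} \Bigl( \frac{n - k^*}{n} \Bigr)^2 + 2 \sum_{1 \le j \le q} \omega_j(q)\, \frac{k^* - j}{n} \Bigl( \frac{n - k^*}{n} \Bigr)^2 \Bigr] \\
&\sim \frac{1}{q} \Bigl[ \theta (1 - \theta)^2 + 2 \sum_{1 \le j \le q} \omega_j(q)\, \theta (1 - \theta)^2 - \frac{2}{n} \sum_{1 \le j \le q} j\, \omega_j(q) (1 - \theta)^2 \Bigr] \\
&= \frac{1 + q}{q}\, \theta (1 - \theta)^2 - \frac{2}{q n} \sum_{1 \le j \le q} j\, \omega_j(q) (1 - \theta)^2 \to \theta (1 - \theta)^2.
\end{aligned}$$

Similarly,

(D.7)
$$\frac{s_{n22}^2}{q \Delta^2} = -\frac{2}{q} \sum_{1 \le j \le q} \omega_j(q)\, \frac{j}{n}\, \frac{n - k^*}{n}\, \frac{k^*}{n} \to 0$$

and

(D.8)
$$\begin{aligned}
\frac{s_{n23}^2}{q \Delta^2}
&= \frac{1}{q} \Bigl[ \frac{n - k^*}{n} \Bigl( \frac{k^*}{n} \Bigr)^2 + 2 \sum_{1 \le j \le q} \omega_j(q)\, \frac{n - j - k^*}{n} \Bigl( \frac{k^*}{n} \Bigr)^2 \Bigr] \\
&\sim \frac{1}{q} \Bigl[ (1 - \theta) \theta^2 + 2 \sum_{1 \le j \le q} \omega_j(q) (1 - \theta) \theta^2 - \frac{2}{n} \sum_{1 \le j \le q} j\, \omega_j(q)\, \theta^2 \Bigr] \\
&= \frac{1 + q}{q} (1 - \theta) \theta^2 - \frac{2}{q n} \sum_{1 \le j \le q} j\, \omega_j(q)\, \theta^2 \to (1 - \theta) \theta^2.
\end{aligned}$$

Putting together relations (D.6), (D.7) and (D.8), we obtain (D.3) for $m = 2$. This completes the proof of Proposition D.1. □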



Remark D.1. Proposition D.1 and, therefore, Theorem 3.2 remain valid if the Bartlett weights (2.14) are replaced by any weights satisfying (D.4) and $\sum_{1 \le j \le q} j\, \omega_j(q) = O(q^2)$ in addition to the following conditions which are needed for (D.2) to hold: $\omega_j(q) = 0$ for $|j| > q$, $0 \le \omega_j(q) \le 1$, and

(D.9)  $\lim_{q \to \infty} \omega_j(q) = 1$ for each $j$;

see Remark 1.2 in [8].
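The weight conditions in Remark D.1 can be verified exactly for one common form of the Bartlett weights. The sketch below assumes $\omega_j(q) = 1 - j/(q+1)$ for $1 \le j \le q$; definition (2.14) is not reproduced in this appendix, so this form is an assumption. Under it, $2 \sum_{1 \le j \le q} \omega_j(q) = q$ and $\sum_{1 \le j \le q} j\, \omega_j(q) = q(q+2)/6$, which is $O(q^2)$. Exact rational arithmetic avoids rounding issues, and the function names are hypothetical.

```python
from fractions import Fraction


def bartlett_weight(j, q):
    """Assumed Bartlett weight 1 - j/(q+1), returned as an exact fraction."""
    return Fraction(q + 1 - j, q + 1)


def weight_sums(q):
    """Return (sum_j w_j(q), sum_j j * w_j(q)) over 1 <= j <= q."""
    s0 = sum(bartlett_weight(j, q) for j in range(1, q + 1))
    s1 = sum(j * bartlett_weight(j, q) for j in range(1, q + 1))
    return s0, s1


if __name__ == "__main__":
    for q in (1, 2, 5, 10, 50):
        s0, s1 = weight_sums(q)
        # 2 * s0 should equal q, and s1 should equal q(q+2)/6.
        print(q, 2 * s0 == q, s1 == Fraction(q * (q + 2), 6))
```

Weights of the form $1 - j/q$ would instead give $2 \sum_{1 \le j \le q} \omega_j(q) = q - 1$, so whether the first identity holds exactly depends on the normalisation chosen in the definition of the weights.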